17 research outputs found

    Textual analysis of artificial intelligence manuscripts reveals features associated with peer review outcome

    We analyzed a data set of scientific manuscripts that were submitted to various conferences in artificial intelligence. We performed a combination of semantic, lexical, and psycholinguistic analyses of the full text of the manuscripts and compared them with the outcome of the peer review process. We found that accepted manuscripts scored lower than rejected manuscripts on two indicators of readability, and that they also used more scientific and artificial intelligence jargon. We also found that accepted manuscripts used words that are less frequent, acquired at an older age, and more abstract than the words used in rejected manuscripts. The analysis of references included in the manuscripts revealed that accepted submissions were more likely to cite the same publications. This finding was echoed by pairwise comparisons of the word content of the manuscripts (i.e., an indicator of semantic similarity), which were more similar within the subset of accepted manuscripts. Finally, we predicted the peer review outcome of manuscripts from their word content: words related to machine learning and neural networks were positively related to acceptance, whereas words related to logic, symbolic processing, and knowledge-based systems were negatively related to acceptance.

    Estimating Open Access Mandate Effectiveness: The MELIBEA Score

    MELIBEA is a Spanish database that uses a composite formula with eight weighted conditions to estimate the effectiveness of Open Access mandates (registered in ROARMAP). We analyzed 68 mandated institutions for publication years 2011-2013 to determine how well the MELIBEA score and its individual conditions predict what percentage of published articles indexed by Web of Knowledge is deposited in each institution's OA repository, and when. We found a small but significant positive correlation (0.18) between MELIBEA score and deposit percentage. We also found that for three of the eight MELIBEA conditions (deposit timing, internal use, and opt-outs), one value of each was strongly associated with deposit percentage or deposit latency (immediate deposit required, deposit required for performance evaluation, unconditional opt-out allowed for the OA requirement but no opt-out for deposit requirement). When we updated the initial values and weights of the MELIBEA formula for mandate effectiveness to reflect the empirical association we had found, the score's predictive power doubled (0.36). There are not yet enough OA mandates to test further mandate conditions that might contribute to mandate effectiveness, but these findings already suggest that it would be useful for future mandates to adopt these three conditions so as to maximize their effectiveness, and thereby the growth of OA.

    Covid-19: where is the data?

    The arrival of the COVID-19 pandemic has led many to argue that scholarly communication and publishing are undergoing a revolution, in terms of not only the wider opening of access to research, but also of access to the data underlying it. In this post, Julien Larrègue, Philippe Vincent-Lamarre, Frédéric Lebaron, and Vincent Larivière discuss findings from their study of papers submitted to the preprint server medRxiv, which shows levels of open data to be stubbornly low.

    Psycholinguistic Correlates of Symbol Grounding in Dictionaries

    A dictionary can be represented as a directed graph with links from defining to defined words. The minimal feedback vertex sets (MinSets, Ms) of a dictionary graph are the smallest sets of words from which all the rest can be defined. We computed Ms for four English dictionaries. The words in the dictionary components revealed by our graph-theoretic analysis differ in their psycholinguistic correlates. Every MinSet has a C-part, whose words are learned at a younger age and are more frequent, and an S-part, whose words are more concrete. Understanding the functional role of these components will require a close study of the words themselves, and of how they are combined into definitions. We can already conclude that the closer a word is to the MinSets that can define all other words, the more concrete and frequent the word is likely to be, and the earlier it is likely to have been learned. This is what one would expect if the words in the MinSets were the ones that had been acquired through direct sensorimotor grounding.
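
    The notion of a MinSet can be illustrated on a toy dictionary. The sketch below is not the authors' implementation: the function names (`definable_from`, `min_sets`) and the six-word dictionary are hypothetical, and the exhaustive search is only feasible for tiny graphs. It assumes every word used in a definition also has its own entry.

```python
from itertools import combinations


def definable_from(definitions, grounded):
    """Return every word that can be reached by definition alone,
    starting from the already-known words in `grounded`.

    `definitions` maps each word to the set of words used to define it.
    """
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, used in definitions.items():
            # A word becomes known once all its defining words are known.
            if word not in known and set(used) <= known:
                known.add(word)
                changed = True
    return known


def min_sets(definitions):
    """Brute-force search for all smallest sets of words from which
    every other word in the dictionary can be defined."""
    words = sorted(definitions)
    for size in range(len(words) + 1):
        found = [
            set(candidate)
            for candidate in combinations(words, size)
            if definable_from(definitions, candidate) == set(words)
        ]
        if found:
            return found
    return []


# Hypothetical toy dictionary (word -> words used in its definition).
toy = {
    "good": {"not", "bad"},
    "bad": {"not", "good"},
    "not": {"no"},
    "no": {"not"},
    "evil": {"bad"},
    "villain": {"evil"},
}

toy_min_sets = min_sets(toy)
```

    In this toy example each smallest set has to break both definitional cycles (good/bad and not/no), so several equally small MinSets exist, which mirrors the observation that a dictionary has many MinSets rather than one.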

    The Latent Structure of Dictionaries

    How many words (and which ones) are sufficient to define all other words? When dictionaries are analyzed as directed graphs with links from defining words to defined words, they reveal a latent structure. Recursively removing all words that are reachable by definition but that do not define any further words reduces the dictionary to a Kernel of about 10%. This is still not the smallest number of words that can define all the rest. About 75% of the Kernel turns out to be its Core, a Strongly Connected Subset of words with a definitional path to and from any pair of its words and no word’s definition depending on a word outside the set. But the Core cannot define all the rest of the dictionary. The 25% of the Kernel surrounding the Core consists of small strongly connected subsets of words: the Satellites. The size of the smallest set of words that can define all the rest (the graph’s Minimum Feedback Vertex Set or MinSet) is about 1% of the dictionary, 15% of the Kernel, and half-Core, half-Satellite. But every dictionary has a huge number of MinSets. The Core words are learned earlier, are more frequent, and are less concrete than the Satellites, which in turn are learned earlier and are more frequent but more concrete than the rest of the Dictionary. In principle, only one MinSet’s words would need to be grounded through the sensorimotor capacity to recognize and categorize their referents. In a dual-code sensorimotor-symbolic model of the mental lexicon, the symbolic code could do all the rest via re-combinatory definition.
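
    The Kernel reduction described above can be sketched as a simple pruning loop. This is a minimal illustration under stated assumptions, not the authors' code: the function name `kernel` and the toy dictionary are hypothetical, and a word is treated as removable when it appears in no remaining word's definition.

```python
def kernel(definitions):
    """Recursively remove words that do not define any further words.

    `definitions` maps each word to the set of words used to define it.
    Returns the set of words left once no more words can be removed.
    """
    # Start from every word that is defined or used in a definition.
    remaining = set(definitions)
    for used in definitions.values():
        remaining |= set(used)

    while True:
        # Words that still appear in the definition of a remaining word.
        defining = set()
        for word in remaining:
            defining |= set(definitions.get(word, ())) & remaining
        removable = remaining - defining
        if not removable:
            return remaining
        remaining -= removable


# Hypothetical toy dictionary (word -> words used in its definition).
toy = {
    "good": {"not", "bad"},
    "bad": {"not", "good"},
    "not": {"no"},
    "no": {"not"},
    "evil": {"bad"},
    "villain": {"evil"},
}

toy_kernel = kernel(toy)
```

    Here "villain" is removed first because no other word uses it, which then makes "evil" removable; the four words left all take part in definitional cycles. Finding the Core and Satellites would additionally require a strongly-connected-components pass over the Kernel, which this sketch omits.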

    The effect of Open Access mandate strength on deposit rate and latency

    Access-denial because of the high cost of journal subscriptions is a major obstacle to the progress of research. Universities and research funders need to adopt mandates that require their researchers to make their published papers Open Access (OA) by depositing them in their Institutional Repositories. To measure the effectiveness of these mandates, MELIBEA has ranked and weighted OA mandates according to their specific requirements, and assigned them an overall score for strength. There is a weak but significant positive correlation between the MELIBEA overall weighted score for mandate strength and the Public Access (PA) deposit rate. If the policy stipulates that deposit is mandatory “For internal use” (e.g., research performance evaluation), deposit rate is significantly higher for PA and RA (Restricted Access) combined; deposit latency for PA alone is also significantly shorter. Finally, if the policy requires that the deposit must be done “At time of acceptance,” deposit rate is significantly higher for combined PA and RA deposits, compared to requiring deposit “At time of publication” or “Unspecified.” This effect is significant only for 2011 and almost significant for all years combined.

    Predatory publishers’ latest scam: bootlegged and rebranded papers

    To thwart publishing rackets that undermine scholars and scholarly publishing, legitimate journals should show their workings.

    Improving reproducibility in machine learning research: a report from the NeurIPS 2019 reproducibility program

    One of the challenges in machine learning research is to ensure that presented and published results are sound and reliable. Reproducibility, that is, obtaining results similar to those presented in a paper or talk using the same code and data (when available), is a necessary step to verify the reliability of research findings. Reproducibility is also an important step to promote open and accessible research, thereby allowing the scientific community to quickly integrate new findings and convert ideas to practice. Reproducibility also promotes the use of robust experimental workflows, which potentially reduce unintentional errors. In 2019, the Neural Information Processing Systems (NeurIPS) conference, the premier international conference for research in machine learning, introduced a reproducibility program designed to improve the standards across the community for how we conduct, communicate, and evaluate machine learning research. The program contained three components: a code submission policy, a community-wide reproducibility challenge, and the inclusion of the Machine Learning Reproducibility checklist as part of the paper submission process. In this paper, we describe each of these components, how it was deployed, and what we were able to learn from this initiative.

    Breeding ecology of the peregrine falcon in Nunavut

    The historical decline of the peregrine falcon (Falco peregrinus) in North America during the middle of the 20th century was mainly attributed to reproductive failure caused by persistent organochlorine pollutants. As a direct result of this finding, the Arctic Raptor Project was established in 1982, with the goal of studying the breeding of Arctic-nesting peregrine falcons (F.p. tundrius). The present article provides a synopsis of the major findings of this research program, which was principally conducted around Rankin Inlet (Nunavut), but also, more recently, around Igloolik and on Baffin Island (Nunavut). The results cover diet, reproductive phenology, growth and survival of young, and population dynamics. The long-term Rankin Inlet study has identified, among other things, a decrease in the number of young over the past 3 decades. Episodes of heavy summer rain, which have occurred more frequently in the study area in recent years, appear to be partly responsible for this decline. The continued study of Arctic-nesting raptors is crucial to our understanding of population dynamics, including how these are affected by changes in climate, in the environment (e.g., reductions in organochlorine pollutant levels), and in the structure and functioning of the Arctic ecosystem.